On the Use of Default Parameter Settings in the Empirical Evaluation of Classification Algorithms
Abstract
We demonstrate that, for a range of state-of-the-art machine learning algorithms, the differences in generalisation performance obtained using default parameter settings and using parameters tuned via cross-validation can be similar in magnitude to the differences in performance observed between state-of-the-art and uncompetitive learning systems. This means that fair and rigorous evaluation of new learning algorithms requires performance comparison against benchmark methods with best-practice model selection procedures, rather than using default parameter settings. We investigate the sensitivity of three key machine learning algorithms (support vector machine, random forest and rotation forest) to their default parameter settings, and provide guidance on determining sensible default parameter values for implementations of these algorithms. We also conduct an experimental comparison of these three algorithms on 121 classification problems and find that, perhaps surprisingly, rotation forest is significantly more accurate on average than both random forest and a support vector machine.
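The comparison the abstract describes, default parameter settings versus parameters tuned by cross-validation, can be sketched in a few lines. The following is a minimal illustration only, not the authors' experimental protocol: it swaps in a toy k-nearest-neighbour classifier on synthetic data in place of the support vector machine, random forest and rotation forest studied in the paper, and every name in it (`make_data`, `knn_predict`, `cv_accuracy`, `compare`, the default `k = 1` and the candidate grid) is a hypothetical choice made for this example.

```python
import random
from collections import Counter


def make_data(n, rng, noise=1.5):
    """Two noisy 2-D Gaussian classes (synthetic, for illustration only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label else -1.0
        point = (rng.gauss(centre, noise), rng.gauss(centre, noise))
        data.append((point, label))
    return data


def knn_predict(train, point, k):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test points classified correctly by k-NN fitted on train."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def cv_accuracy(data, k, folds=5):
    """Mean accuracy over `folds` cross-validation splits of the training data."""
    scores = []
    for i in range(folds):
        held_out = data[i::folds]
        rest = [d for j, d in enumerate(data) if j % folds != i]
        scores.append(accuracy(rest, held_out, k))
    return sum(scores) / folds


def compare(seed=0, default_k=1, grid=(1, 3, 5, 9, 15, 25)):
    """Evaluate an assumed default k and a cross-validated k on the same test set."""
    rng = random.Random(seed)
    train, test = make_data(200, rng), make_data(500, rng)
    # Model selection uses the training data only; the test set stays untouched.
    cv_scores = {k: cv_accuracy(train, k) for k in grid}
    tuned_k = max(grid, key=lambda k: cv_scores[k])
    return {
        "default_k": default_k,
        "tuned_k": tuned_k,
        "default_test_acc": accuracy(train, test, default_k),
        "tuned_test_acc": accuracy(train, test, tuned_k),
    }


if __name__ == "__main__":
    print(compare())
```

The point of the design is that the tuned value is chosen from cross-validation scores on the training data alone, so the held-out test accuracy of the default and tuned settings can be compared on equal terms. Repeating `compare` over several seeds gives a rough sense of how large the gap can be for this toy setup.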
Journal: CoRR
Volume: abs/1703.06777
Publication year: 2017